
    Visual Similarity Using Limited Supervision

    The visual world is a conglomeration of objects, scenes, motion, and much more. As humans, we look at the world through our eyes, but we understand it with our brains. From a young age, humans learn to recognize objects by association: we link an object or action to the most similar one in our memory to make sense of it. Within the field of Artificial Intelligence, Computer Vision gives machines the ability to see. While digital cameras provide the machine with eyes, Computer Vision develops its brain. To that end, Deep Learning has emerged as a very successful tool, allowing machines to learn solutions to problems directly from data. Built on Deep Learning, computers can nowadays also learn to interpret the visual world. However, the way machines learn is very different from ours: in Deep Learning, images and videos are grouped into predefined, artificial categories, and describing a group of objects or actions with a single integer (category) disregards most of their characteristics and pairwise relationships. To circumvent this, we propose to expand the categorical model with visual similarity, which better mirrors the human approach.

    Deep Learning requires a large set of manually annotated samples that form the training set. Collecting training samples is easy given the endless amount of images and videos available on the internet; annotating them, however, is costly and laborious, and thus a major bottleneck in modern computer vision. In this thesis, we investigate visual similarity methods for image and video classification, searching in particular for solutions where human supervision is marginal. We focus on Zero-Shot Learning (ZSL), where only a subset of categories is manually annotated. After studying existing methods in the field, we identify common limitations and propose methods to tackle them. In particular, ZSL image classification is trained using only discriminative supervision, i.e. predefined categories, while ignoring other descriptive characteristics. To address this, we propose a new approach to learn shared, i.e. non-discriminative and thus descriptive, features, which improves existing methods by a large margin. However, while ZSL has shown great potential for image classification, for example in the case of face recognition, it has performed poorly for video classification. We identify the reasons for this lack of progress and provide a new, powerful baseline.

    Even though ZSL requires only partially labeled data, it still needs supervision during training. For that reason, we also investigate purely unsupervised methods. A successful paradigm is self-supervision: the model is trained on a surrogate task for which supervision is provided automatically. The key to self-supervision is the ability of deep learning to transfer knowledge learned on one task to a new task; the more similar the two tasks are, the more effective the transfer. As in our work on ZSL, we study the common limitations of existing self-supervision approaches and propose a method to overcome them: a policy network that controls the parameters of the surrogate task and is trained through reinforcement learning.
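    A minimal sketch of that last idea, under assumptions of ours rather than the thesis implementation (the candidate-configuration set, the policy architecture, and the stubbed reward are all illustrative): a small policy network picks among discrete surrogate-task configurations and is updated with REINFORCE, where the reward would in practice measure how much the chosen configuration improves the self-supervised model.

```python
import torch
import torch.nn as nn

# Hypothetical set of surrogate-task configurations (e.g. candidate patch
# permutations or rotation sets); the number and meaning are illustrative.
NUM_CONFIGS = 8

policy = nn.Sequential(nn.Linear(NUM_CONFIGS, 32), nn.ReLU(), nn.Linear(32, NUM_CONFIGS))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

def surrogate_task_reward(config_id: int) -> float:
    # Placeholder: in practice, the improvement of the self-supervised model
    # on a held-out proxy task after training with this configuration.
    return torch.rand(1).item()

state = torch.ones(NUM_CONFIGS)  # trivial constant "state" for this toy loop
for step in range(100):
    dist = torch.distributions.Categorical(logits=policy(state))
    action = dist.sample()                         # choose a configuration
    reward = surrogate_task_reward(action.item())
    loss = -dist.log_prob(action) * reward         # REINFORCE, no baseline
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```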
    Finally, we present a real-life application where visual similarity with limited supervision provides a better solution than existing parametric approaches. We analyze the behavior of motor-impaired rodents performing a single repeated action, for which our method provides an objective measure of behavioral similarity, facilitating comparisons across animal subjects and over time during recovery.
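    To make the "classification by similarity" theme concrete, here is a rough illustration under assumptions of ours (the 300-dimensional embeddings and the random vectors standing in for a learned encoder are illustrative): samples are compared by cosine similarity in an embedding space, and a query is assigned the label of its nearest annotated reference.

```python
import torch
import torch.nn.functional as F

def nearest_reference(query_emb, reference_embs, reference_labels):
    # Assign the query the label of its most similar reference embedding.
    sims = F.cosine_similarity(query_emb.unsqueeze(0), reference_embs, dim=1)
    best = sims.argmax().item()
    return reference_labels[best], sims[best].item()

# Toy usage: random vectors stand in for embeddings of annotated samples.
references = F.normalize(torch.randn(10, 300), dim=1)
labels = [f"category_{i}" for i in range(10)]
query = F.normalize(torch.randn(300), dim=0)
label, similarity = nearest_reference(query, references, labels)
print(label, round(similarity, 3))
```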

    Rethinking Zero-shot Video Classification: End-to-end Training for Realistic Applications

    Trained on large datasets, deep learning (DL) can accurately classify videos into hundreds of diverse classes. However, video data is expensive to annotate. Zero-shot learning (ZSL) proposes one solution to this problem. ZSL trains a model once and generalizes to new tasks whose classes are not present in the training dataset. We propose the first end-to-end algorithm for ZSL in video classification. Our training procedure builds on insights from recent video classification literature and uses a trainable 3D CNN to learn the visual features. This is in contrast to previous video ZSL methods, which use pretrained feature extractors. We also extend the current benchmarking paradigm: previous techniques aim to make the test task unknown at training time but fall short of this goal. We encourage domain shift across training and test data and disallow tailoring a ZSL model to a specific test dataset. We outperform the state of the art by a wide margin. Our code, evaluation procedure and model weights are available at this http URL.
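    A minimal sketch of the kind of end-to-end pipeline described above, under assumptions of ours rather than the paper's exact configuration (torchvision's r3d_18 backbone, a 300-dimensional semantic space, and cosine-based nearest-class inference are illustrative choices): the 3D CNN is trained to map each clip onto the word embedding of its class, and a test video is assigned the unseen class whose embedding is closest.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models.video import r3d_18

EMB_DIM = 300  # e.g. word2vec dimensionality (assumption)

# Trainable 3D CNN mapping a video clip directly into the semantic space,
# instead of relying on a frozen, pretrained feature extractor.
model = r3d_18(weights=None)
model.fc = nn.Linear(model.fc.in_features, EMB_DIM)

def training_loss(clips, class_embeddings):
    # clips: (B, 3, T, H, W); class_embeddings: (B, EMB_DIM) word vectors of
    # the ground-truth seen classes. A cosine loss pulls each visual embedding
    # towards its class embedding.
    visual = F.normalize(model(clips), dim=1)
    semantic = F.normalize(class_embeddings, dim=1)
    return (1 - (visual * semantic).sum(dim=1)).mean()

@torch.no_grad()
def predict_unseen(clip, unseen_class_embeddings):
    # Label a clip with the unseen class whose word embedding is closest.
    visual = F.normalize(model(clip.unsqueeze(0)), dim=1)
    sims = F.cosine_similarity(visual, F.normalize(unseen_class_embeddings, dim=1))
    return sims.argmax().item()
```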

    Unsupervised Behaviour Analysis and Magnification (uBAM) using Deep Learning

    Motor behaviour analysis is essential to biomedical research and clinical diagnostics, as it provides a non-invasive strategy for identifying motor impairment and its change caused by interventions. State-of-the-art instrumented movement analysis is time- and cost-intensive, since it requires placing physical or virtual markers. Besides the effort required for marking keypoints or annotations necessary for training or fine-tuning a detector, users need to know the interesting behaviour beforehand to provide meaningful keypoints. We introduce unsupervised behaviour analysis and magnification (uBAM), an automatic deep learning algorithm for analysing behaviour by discovering and magnifying deviations. A central aspect is unsupervised learning of posture and behaviour representations to enable an objective comparison of movement. Besides discovering and quantifying deviations in behaviour, we also propose a generative model for visually magnifying subtle behaviour differences directly in a video, without requiring a detour via keypoints or annotations. Essential for this magnification of deviations, even across different individuals, is a disentangling of appearance and behaviour. Evaluations on rodents and human patients with neurological diseases demonstrate the wide applicability of our approach. Moreover, combining optogenetic stimulation with our unsupervised behaviour analysis shows its suitability as a non-invasive diagnostic tool correlating function to brain plasticity.
    Comment: Published in Nature Machine Intelligence (2021), https://rdcu.be/ch6p
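    A conceptual sketch of the disentangle-then-magnify idea, purely under assumptions of ours (the toy encoder/decoder, the latent sizes, and the linear extrapolation factor are illustrative, not the published uBAM architecture): appearance and posture are encoded separately, and a deviation is magnified by extrapolating the posture code away from a healthy reference before decoding it with the subject's own appearance code.

```python
import torch
import torch.nn as nn

class DisentangledAE(nn.Module):
    # Toy encoder/decoder with separate appearance and posture codes.
    def __init__(self, app_dim=64, pose_dim=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(3, 32, 4, 2, 1), nn.ReLU(),
                                 nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU(),
                                 nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.to_app = nn.Linear(64, app_dim)     # "who" (appearance)
        self.to_pose = nn.Linear(64, pose_dim)   # "how it moves" (posture)
        self.dec = nn.Sequential(nn.Linear(app_dim + pose_dim, 64 * 8 * 8), nn.ReLU(),
                                 nn.Unflatten(1, (64, 8, 8)),
                                 nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),
                                 nn.ConvTranspose2d(32, 3, 4, 2, 1))

    def encode(self, x):
        h = self.enc(x)
        return self.to_app(h), self.to_pose(h)

    def decode(self, appearance, posture):
        return self.dec(torch.cat([appearance, posture], dim=1))

def magnify(model, frame, healthy_reference, factor=2.0):
    # Amplify the posture deviation from a healthy reference while keeping
    # the subject's appearance unchanged.
    app, pose = model.encode(frame)
    _, pose_ref = model.encode(healthy_reference)
    pose_mag = pose_ref + factor * (pose - pose_ref)   # extrapolate deviation
    return model.decode(app, pose_mag)
```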

    OCELOT: Overlapped Cell on Tissue Dataset for Histopathology

    The OCELOT dataset is a histopathology dataset designed to facilitate the development of methods that utilize cell and tissue relationships. It comprises small and large field-of-view (FoV) patches with overlapping regions, extracted from digitally scanned whole slide images (WSIs). The small FoV patches are accompanied by cell annotations and the large FoV patches by tissue annotations. The WSIs are sourced from the publicly available TCGA database and were stained using the H&E method before being scanned with an Aperio scanner. For more details, please check https://lunit-io.github.io/research/ocelot_dataset/.

    Before downloading the dataset, please carefully read and agree to the Terms and Conditions at https://lunit-io.github.io/research/ocelot_tc/, and provide 1. name, 2. e-mail address, 3. organization/company name.

    Release notes:
    In version 1.0.1, we excluded four test cases (586, 589, 609, 615) due to an under-annotation issue.
    In version 1.0.0, we included images and annotations for the validation and test splits.
    In version 0.1.2, we modified the coordinates of cell labels to range from 0 to 1023 (-1 from the previous coordinates).
    In version 0.1.1, we removed non-H&E-stained patches from the dataset.

    This dataset is used for the OCELOT 2023 challenge (https://ocelot2023.grand-challenge.org/) at MICCAI 2023.
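    As a small illustration of the bookkeeping the overlapping small/large FoV design enables (the origin and microns-per-pixel arguments below are hypothetical metadata fields used only to show the geometry, not the dataset's actual file format; consult the official page above for the real layout): a cell annotated in a small FoV patch, with coordinates in the 0-1023 range mentioned in the release notes, can be mapped into the pixel frame of the overlapping large FoV tissue patch.

```python
def cell_to_tissue_coords(cell_xy, cell_origin_um, cell_mpp,
                          tissue_origin_um, tissue_mpp):
    # cell_xy:          (x, y) in cell-patch pixels, range 0..1023
    # cell_origin_um:   top-left corner of the cell patch in WSI microns
    # cell_mpp:         microns per pixel of the cell patch
    # tissue_origin_um: top-left corner of the tissue patch in WSI microns
    # tissue_mpp:       microns per pixel of the tissue patch
    # All origins/resolutions are hypothetical metadata for illustration.
    x_um = cell_origin_um[0] + cell_xy[0] * cell_mpp   # patch pixel -> WSI microns
    y_um = cell_origin_um[1] + cell_xy[1] * cell_mpp
    x_px = (x_um - tissue_origin_um[0]) / tissue_mpp   # WSI microns -> tissue pixel
    y_px = (y_um - tissue_origin_um[1]) / tissue_mpp
    return x_px, y_px

# Example: a cell at (512, 512) in a 0.2 mpp cell patch whose corner lies
# 100 microns inside a 0.5 mpp tissue patch (all values illustrative).
print(cell_to_tissue_coords((512, 512), (1000.0, 1000.0), 0.2, (900.0, 900.0), 0.5))
```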